Learning to de-anonymize social networks
Abstract
Releasing anonymized social network data for analysis has been a popular idea among data providers. Despite evidence to the contrary, the belief that anonymization will solve the privacy problem in practice refuses to die. This dissertation contributes to the field of social graph de-anonymization by demonstrating that even automated models can be quite successful in breaching the privacy of such datasets. We propose novel machine-learning-based techniques to learn the identities of nodes in social graphs, thereby automating manual, heuristic-based attacks. Our work extends the vast literature on social graph de-anonymization attacks by systematizing them. We present a random-forest-based classifier which uses structural features of nodes, based on neighborhood degree distribution, to predict their similarity. Using these simple and efficient features, we design versatile and expressive learning models which can learn the de-anonymization task from just a few examples. Our evaluation establishes their efficacy in transforming de-anonymization into a learning problem. The learning is transferable in that a model trained on one graph can be used to attack another. Moving on, we demonstrate the versatility and wider applicability of the proposed model by using it to solve the long-standing problem of benchmarking social graph anonymization schemes. Our framework bridges a fundamental research gap by making cheap, quick and automated analysis of anonymization schemes possible, without even requiring their full description. The benchmark is based on a comparison of structural information leakage vs. utility preservation. We study the trade-off of anonymity vs. utility for six popular anonymization schemes, including those promising k-anonymity. Our analysis shows that none of the schemes is fit for purpose.
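As a rough illustration of what structural node features based on neighborhood degree distribution might look like, the sketch below bins the degrees of a node's 1-hop and 2-hop neighbors into a fixed-length vector and concatenates the vectors of two nodes into a pair feature. The function names, the bin size, the number of bins and the choice of two hops are all hypothetical choices made for this example; the abstract does not specify them. Training the classifier itself (for instance a random forest over labeled pairs) is not shown.

```python
# Hypothetical sketch: binned neighborhood degree features for a node.
# A graph is a dict mapping each node to the set of its neighbors.

def degree_histogram(graph, nodes, bin_size, n_bins):
    """Count the degrees of `nodes` in fixed-width bins.

    Degrees too large for the last bin are counted in the last bin.
    """
    hist = [0] * n_bins
    for v in nodes:
        idx = min(len(graph[v]) // bin_size, n_bins - 1)
        hist[idx] += 1
    return hist


def node_features(graph, node, bin_size=2, n_bins=3):
    """Concatenate degree histograms of the 1-hop and 2-hop neighborhoods.

    bin_size and n_bins are illustrative defaults, not values from the source.
    """
    one_hop = set(graph[node])
    two_hop = set()
    for u in one_hop:
        two_hop.update(graph[u])
    # Keep only nodes at distance exactly two from `node`.
    two_hop -= one_hop
    two_hop.discard(node)
    return (degree_histogram(graph, one_hop, bin_size, n_bins)
            + degree_histogram(graph, two_hop, bin_size, n_bins))


def pair_features(graph_a, node_a, graph_b, node_b, bin_size=2, n_bins=3):
    """Feature vector for a candidate node pair drawn from two graphs."""
    return (node_features(graph_a, node_a, bin_size, n_bins)
            + node_features(graph_b, node_b, bin_size, n_bins))


# Toy graph used only to exercise the functions above.
TOY_GRAPH = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "e"},
    "d": {"a"},
    "e": {"c"},
}

if __name__ == "__main__":
    print(node_features(TOY_GRAPH, "a"))
    print(pair_features(TOY_GRAPH, "a", TOY_GRAPH, "c"))
```

In a learning setup of the kind the abstract describes, each pair vector would carry a label saying whether the two nodes are the same individual, and a classifier would be fitted on those labeled pairs.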
Finally, we present an end-to-end social graph de-anonymization attack which uses the proposed machine learning techniques to recover node mappings across intersecting graphs. Our attack advances the state of the art in graph de-anonymization by demonstrating better performance than all the other attacks, including those that use seed knowledge. The attack is seedless and heuristic-free, which demonstrates the superiority of machine learning techniques over hand-selected parametric attacks.
Acknowledgments
First and foremost, I would like to thank my supervisor Ross Anderson, without whom this dissertation would not have been possible. He helped me at critical junctures and provided encouragement and valuable feedback; I owe him much gratitude for whatever I managed to achieve at Cambridge. I thank George Danezis for mentoring me during the initial stages of my PhD and teaching me how to …
Similar resources
Structure Based Data De-Anonymization of Social Networks and Mobility Traces
We present a novel de-anonymization attack on mobility trace data and social data. First, we design a Unified Similarity (US) measurement, based on which we present a US-based De-Anonymization (DA) framework which iteratively de-anonymizes data with an accuracy guarantee. Then, to de-anonymize data without the knowledge of the overlap size between the anonymized data and the auxiliary data, we...
An Iterative Algorithm for Graph De-anonymization
The availability of social network data is indispensable for numerous types of research. Nevertheless, data owners are often reluctant to release social network data, as the release may reveal the private information of the individuals involved in the data. To address this problem, several techniques have been proposed to anonymize social networks for privacy preserving publications. To evaluat...
Investigating the Impact of Virtual Social Networks on Social Capital and Organizational Learning Capabilities with the Mediating Role of Helpful Activities
Introduction: The main topic of this research is investigating the impact of virtual social networks on social capital and organizational learning capabilities with the mediating role of helpful activities. An important feature of social networks is that they have become a place to share knowledge, which in turn contributes to the quantitative and qualitative improvement of social capital. Thus...
Social Network De-Anonymization and Privacy Inference with Knowledge Graph Model
Social network data is widely shared, transferred and published for research purposes and business interests, but this has raised much concern about users’ privacy. Even though users’ identity information is always removed, attackers can still de-anonymize users with the help of auxiliary information. To protect against de-anonymization attacks, various privacy protection techniques for social networ...
The Effects of Social Networks on Nursing Students’ Academic Achievement and Retention in Learning English
Introduction: The use of modern virtual technologies in the process of teaching-learning is inevitable. One example is the use of virtual social networks in education. The purpose of this study was to examine the effects of social networking on nursing students’ academic achievement and retention in learning English. Methods: The pretest-posttest design with a control group was used in this qua...
Publication date: 2016